[2403.18814] Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models

Source: arXiv. Published on Thursday, March 28, 2024.
We try to narrow the gap by mining the potential of VLMs for better performance and any-to-any workflow from three aspects, i.e., high-resolution ...